Abstract

This paper develops a model and an associated estimation procedure to
forecast and control the rate of sales for a new product. A repeat-purchase
diffusion model is developed, incorporating the effect of marketing variables
as well as a word-of-mouth effect. Bayesian estimation, with priors developed
from past products, is used to update the parameters of the model. The
procedure, shown to predict better and give more stable parameter estimates
than classical procedures, is used to develop marketing policies for new
product introduction.
1.   Introduction

Early in the life of a frequently purchased product, there are often too few
data available either to forecast long-term sales accurately or to make
proper marketing decisions. A popular procedure (see Blattberg and Golanty
[4], for example) is to make direct use of model parameters from other
similar products.

But all products have some uniqueness: how should experience with similar
products be incorporated into an estimation and control procedure? Bayesian
analysis (see Raiffa and Schlaifer [15], for example) was developed to
incorporate past experience in a systematic, formal way. We incorporate
Bayesian estimation for the purpose of forecasting and control into a
repeat-purchase model where a word-of-mouth effect is significant. We show
that, as sales data become available, the parameters of the model and the
marketing policies can be updated in a Bayesian framework. This framework,
combining past (pre-market) information with the data about the specific
product, gives stable parameter estimates and policy guidelines.
2.   Word of Mouth in New Product Diffusion

In many product-marketing situations, the impact of brand promotional
efforts is enhanced by a "word-of-mouth" effect, that is, by the
recommendation of the brand by current satisfied users to potential users.
Examples of such situations are:

•  satisfied viewers of a movie, or users of a restaurant or resort
recommending it to their friends,

•  doctors recommending a successful new drug to their colleagues,

•  women recommending a new food store to other housewives.

In each of these examples, initial users are attracted by some marketing
effort (advertising or sales promotion). Their use then enhances the impact
of that effort on the rest of the potential user population.

In some situations it might be desirable actually to direct some of the
initial marketing effort toward "opinion leaders," people who are more likely
to try the new product and whose subsequent recommendations will carry more
weight than those of the rest of the target population. Arndt [1], for
example, points to the importance of the word-of-mouth effect in developing
advertising policies. Silk and Davis [16] review the literature dealing with
influence processes in marketing situations, and stress the need for explicit
understanding and measurement of these effects. Dodson and Muller [6] develop
a general mathematical formulation for new product diffusion problems, both
for durable and non-durable goods. They focus on advertising effects as well
as word-of-mouth effects (although they do not treat issues of parameter
estimation and control).

Thus, it appears that mathematical models of such marketing situations
should explicitly consider the interaction between marketing expenditures and
word-of-mouth effects in the development of policies.


This paper hypothesizes and develops an estimation and control procedure
for a model structure that explicitly includes the word-of-mouth effect. For
the sake of definiteness we consider the marketing of an ethical drug, aimed
at a certain specialty class of doctors. One of the most important components
of the marketing mix employed by pharmaceutical companies is "detailing,"
i.e., personal selling by a force of "detailmen," who visit doctors and
describe the portfolio of products produced by their company, provide free
samples and literature, and of course, attempt to combat the efforts of
detailmen from competing companies. Surveys performed over a number of years
have indicated that physicians generally perceive detailmen as influential
sources of information (Bauer and Wortzel [2]). Other components of the
marketing mix include medical journal and magazine advertising and direct
mail, but a smaller portion of the total marketing budget is devoted to these
components than to detailing.

For a new product, the impact of company marketing effort is augmented
by the word-of-mouth effect that occurs when doctors first prescribing the
product find it satisfactory and recommend it to their colleagues. A
classical study in this area was performed by Coleman, Katz, and Menzel [5].

One of the problems in testing such models is that data on word-of-mouth
are hard to collect, and are usually not collected. Therefore, our model
validation has to be indirect in nature, i.e., we postulate the nature of the
word-of-mouth effect and then, using the observed data, check to see whether
the model is consistent with the data. Data for two ethical drugs were used
to demonstrate the use of the model.

The heart of this analysis is "trial and repeat" model structuring. A
number of re-purchase models have been developed; the most popular use panel
data collected at the test-market stage of new product introduction to
estimate long-term rates (Fourt and Woodlock [8], Parfitt and Collins [14],
and Eskin [7]). Kalwani and Silk [9] develop some interesting insight into
the nature of repeat purchase estimation, formalizing some of Eskin's [7]
hypotheses. All these models are descriptive in nature, though; they focus
on the forecasting issue, not the decision of controlling the level of
marketing effort.

As noted earlier, Dodson and Muller [6] do incorporate an advertising
variable into a repeat purchase model, but give no insight on how the model
might be calibrated and used for decision making.

To our knowledge, there is no model or procedure available that focuses
on the dynamic updating and control of a diffusion-type process in a
marketing context. Our objective here is to develop and demonstrate the use
of such a procedure.

The application developed here explicitly considers only the detailing
activity on behalf of, and against, a new product, and the interaction of
this effort with the word-of-mouth effect. Advertising and direct mail have
been left out to simplify the exposition. Normally these marketing efforts
are highly correlated with detailing effort, so that not much information is
lost by considering detailing alone in the model. The approach here differs
from that developed by Montgomery, Silk and Zaragoza [13] in that we address
the impact of word-of-mouth effects in the context of developing a long-term
total detailing strategy. Montgomery et al. develop a more detailed, tactical
procedure that is heavily dependent upon managerial judgment for calibration,
i.e., a decision-calculus approach (Little [10]).

For policy development, the model is used to derive "good" detailing
policies. We call them "good" rather than "optimal," because they have been
specified to be profit improving as well as easily implementable in the
total detailing context, rather than just profit maximizing. Management has
to allocate detailmen's time across a variety of products; therefore a policy
for a single product must be simple enough to be incorporated within the
total portfolio. This, we believe, precludes policies that are highly state
and time dependent, requiring frequent changes in effort allocation. A policy
that seems to fit these marketing realities is of a pulsed type, i.e., a
short period of high-effort detailing during product introduction followed by
a much lower "maintenance level" of detailing over the remainder of the
planning horizon.

Managerial use of the model in the context of a new product presents
some novel aspects. Since the key period in the planning horizon occurs at
the beginning, when there are no marketing data on the product, even purely
adaptive estimation of parameter values cannot be advocated as a model
calibration strategy. We believe that the appropriate approach is to model a
variety of products, obtaining the model parameters for each, and using
information about those parameters to develop a prior distribution of
parameter estimates for the new product. These estimates are used to develop
initial policy decisions, which are updated as sales data become available.
3.   The General Model

Consider the case of ethical drug adoption where there are $N^*$ doctors
in the prescribing class (psychiatrists for anti-depressants, e.g.) of which
$N$ ($N \le N^*$) may eventually prescribe the drug. We observe the number of
prescriptions, which we assume is closely related to the number of
prescribing doctors. Note that our model derivation assumes linearity in this
relationship, which is not the case in general. The most productive doctors,
who write a disproportionate number of prescriptions, are also more likely to
be early adopters. This is critical if we wish to make inferences about the
true value of N. Our objectives, however, are to (a) infer the time path of
product sales and (b) develop promotional policies. For this purpose the
concept of an "average" doctor is sufficient, as operationalized in the
process described below.

Figure 1 describes the process we wish to study, which in its most
complete form has three states: (1) never prescribed, (2) prescribing, and
(3) used to prescribe. The activities affecting the various flows are
labelled. Here we have a trial structure (movement from state 1 to state 2)
and a repeat structure (remaining in state 2 or movement from state 3 to
state 2).

Note  that  early  in  the  life  of  the  drug,  the  flow  will  be  almost  entirely 
from  state  1  to  state  2.  Later  on,  the  flow  switches  to  a  state  2  and  state  3 
interchange. 

The  three  state  model,  however  complete,  has  too  many  parameters  for 
efficient  estimation  for  any  of  the  data  sets  we  have  examined.  We  therefore 
use  a  two-state  model  —  prescribing  vs.  not  prescribing  —  as  an  approximation. 
The  difference  in  the  detailing  effectiveness  between  the  "trial"  and  "repeat" 
portion  of  the  three  state  model  will  be  handled  in  parameter  estimation  by 
an  effectiveness  decay  factor,  f(t),  applied  to  the  coefficient  of  detailing 
for  the  new  drug. 
Figure 1:   Complete Flow Model Describing the Process

Let

$C_1(t)$  =  number of doctors at t not prescribing, t = 1, 2, ...

$C_2(t)$  =  number of doctors prescribing the drug at t.

$K_1(t)$  =  number of new (i.e., initial) prescriptions observed at t.

$K_2(t)$  =  number of prescription renewals at t.

$W$  =  random variable, the number of patients actually using the
drug class that a randomly chosen doctor has.

Note that although the model is structured in terms of the number of
prescribing doctors, the data we observe are the number of prescriptions.
Hence, we assume that

$K_1(t) + K_2(t) = C_2(t) \, E(W)$

or, equivalently,

$C_2(t) = (K_1(t) + K_2(t)) / E(W)$

We describe the flows between these two classes of doctors
(1 = not prescribing and 2 = prescribing) as follows:
The flow from $C_1$ to $C_2$ is affected

a)  by level of detailing

b)  by word-of-mouth effect related to the change in the
number of prescribing doctors.

The flow from $C_2$ to $C_1$ is affected by

a)  competitive detailing.

b)  possible word of mouth.

Figure  2  describes  this  process. 
Figure 2:   Simplified Flow Model Describing the Process


Let

$\bar{d}(t)$  =  competitive detailing level at t.

$d(t)$  =  level of detailing at t.

$\lambda_2(C_2(t) - C_2(t-1))$  =  word-of-mouth effect.

$\lambda_3(\bar{d}(t))$  =  competitive detailing effect.

$f(t)$  =  decay factor for detailing effect.

The decay factor f(t) allows us to consider the possibility that the
same amount of detailing effort might have varying effectiveness at different
stages in the life of the drug. Normally, we would expect f(0) = 1 and
f(t) to be non-increasing, i.e., as the product becomes more established,
detailing becomes less effective, unless a new communication strategy is
developed. We now define

$\lambda_1(d(t), t) = \lambda_1(d(t) \cdot f(t))$

to be detailing effectiveness. This formulation is similar to Little's copy
effectiveness factor in BRANDAID [11]. An alternative formulation, making
$\lambda_1$ itself a function of time, was rejected as overly complex. For
ease of notation, we will use the term $\lambda_1(d(t))$ to refer to the
$\lambda_1(d(t), t)$ above. Now,

(1a)    $C_2(t+1) = C_2(t) + \lambda_1(d(t)) \cdot C_1(t) + \lambda_2(C_2(t) - C_2(t-1)) - \lambda_3(\bar{d}(t)) \cdot C_2(t)$

and

(1b)    $C_1(t) + C_2(t) = N$   for all t.

Note that this model structure handles the word-of-mouth term
($\lambda_2$) in a different way than the interactions in most diffusion
models. The advantage of this formulation is that it approximates the
$(N - X)X = NX - X^2$ interaction term used in Bass [3] and other
formulations by a time-based difference $(X(t) - X(t-1))$, which permits a
negative word-of-mouth effect (lost sales) due to competitive activity or to
bad product experience, for example. This model is thus symmetric in that the
model structure handles competitive word-of-mouth explicitly.

This  model  does  have  several  important  simplifying  assumptions.   The  first 
is  that  N,  the  number  of  doctors  in  the  class,  is  assumed  fixed.   Mahajan  and 
Peterson  [12]  show  how  this  assumption  can  be  relaxed. 

The second assumption is that all doctors are in the same class
(psychiatrists versus general practitioners, for example). It is not
difficult to amend the model to eliminate this assumption by constructing a
series of parallel processes, such as that in Figure 2, for each class of
doctors.

A  third  assumption  is  that  detailing  effectiveness  is  not  related  to  the 
current  number  of  prescribing  doctors.   This  could  be  handled  in  the  model 
through  an  interaction  term  between  simple  detailing  effectiveness  and  the 
word-of-mouth  effect. 

These modifications are beyond the scope of our current objectives,
however, and data needed to attempt such extensions are not available.
4.   Estimation and Validation

In  the  previous  section,  we  proposed  a  model  structure  for  the  detailing 
decision.  Now  we  must  answer  two  questions: 

(a)  Is  the  model  good,  i.e.  does  it  perform  better  as  a  forecaster 
than  alternative,  naive  models? 

(b)  How  does  one  use  the  model  in  the  typical  new  product  situation 
when  either  no,  or  very  little  data  is  available  for  the  product? 

The parameter estimation issues involved in (a) and (b) are different
because in validating the model we can use a substantial amount of historical
data on a product. In this section we focus on (a). We propose functional
forms for the responses $\lambda_i(\cdot)$ and show how the parameters of
these forms are estimated using part of the data for a particular product.
The model is then used to forecast sales of the product. These forecasts are
compared to the actual sales achieved. Two naive models, one a polynomial in
time and one an autoregressive scheme, are also estimated and used for
forecasting. These forecasts are also compared to actual sales, and the
resulting root mean square errors are used to test the validity of the
proposed model. The issues raised in (b) are discussed in the next section.

Consider  now  the  specification  of  functional  forms  for  our  response 
models.  Although  linear  response  functions  are  tempting  to  use  from  the 
estimation  viewpoint  they  are  clearly  unsatisfactory  for  policy  development 
purposes  since  they  imply  that  marketing  efforts  should  be  either  zero  or  as 
large  as  possible.  Non-linearity  of  response  for  determining  detailing  effort 
for  a  brand  is  essential. 

Consider the following model form. It is a simple form that contains
non-linearity.

Let

$\lambda_1(d(t)) = a_1 d(t) + a_2 d^2(t)$

$\lambda_2(C_2(t) - C_2(t-1)) = a_4 (C_2(t) - C_2(t-1))$

$\lambda_3(\bar{d}(t)) = a_3 \bar{d}(t)$

Substituting $C_1(t) = N - C_2(t)$ in (1a) we get

(2)     $C_2(t+1) - C_2(t) = \lambda_1(d(t)) (N - C_2(t)) + \lambda_2(C_2(t) - C_2(t-1)) - \lambda_3(\bar{d}(t)) \cdot C_2(t)$

        $= N \lambda_1(d(t)) - \lambda_1(d(t)) C_2(t) + \lambda_2(C_2(t) - C_2(t-1)) - \lambda_3(\bar{d}(t)) \cdot C_2(t)$,

and plugging in the proposed functional forms for the $\lambda_i$'s we obtain:

(3)     $C_2(t+1) - C_2(t) = (a_1 d(t) + a_2 d^2(t)) (N - C_2(t)) - a_3 \bar{d}(t) C_2(t) + a_4 (C_2(t) - C_2(t-1))$

This equation contains five unknown parameters: $a_1$, $a_2$, $a_3$, $a_4$
and N, with N appearing in a way that makes it impossible to use conventional
linear estimation procedures. Direct estimation of the parameters using
nonlinear estimation methods leads to unstable results due to
multicollinearity, present in all the data sets we examined. However, if N is
known, then (3) becomes linear in its parameters.
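
As a minimal sketch of how equation (3) carries the number of prescribing doctors forward one period, the Python function below evaluates the trial, competitive-loss and word-of-mouth terms separately. The function name, the argument names and the numbers in the usage lines are illustrative assumptions, not estimates from the paper; the decay factor f is applied to own detailing, following the definition of detailing effectiveness given earlier.

```python
def next_prescribers(c2_t, c2_prev, d, d_comp, a1, a2, a3, a4, n_total, f=1.0):
    """One step of equation (3): returns C2(t+1).

    c2_t, c2_prev : prescribing doctors at t and t-1
    d, d_comp     : own and competitive detailing at t
    a1..a4        : response coefficients (a2 < 0 gives diminishing returns)
    n_total       : N, the doctors who may eventually prescribe
    f             : decay factor f(t) applied to own detailing
    """
    d_eff = d * f                                  # lambda_1 is evaluated at d(t) * f(t)
    trial = (a1 * d_eff + a2 * d_eff ** 2) * (n_total - c2_t)
    competitive_loss = a3 * d_comp * c2_t
    word_of_mouth = a4 * (c2_t - c2_prev)          # negative when prescribing is falling
    return c2_t + trial - competitive_loss + word_of_mouth


# Hypothetical values chosen only to make the arithmetic easy to follow.
example = next_prescribers(c2_t=100, c2_prev=80, d=10, d_comp=20,
                           a1=0.01, a2=-0.0001, a3=0.001, a4=0.5, n_total=1000)
example_decayed = next_prescribers(100, 80, 10, 20,
                                   0.01, -0.0001, 0.001, 0.5, 1000, f=0.5)
```

With these inputs the trial, competitive-loss and word-of-mouth terms are 81, 2 and 10, so `example` is 189; halving the effectiveness of own detailing (`f=0.5`) lowers the trial term to 42.75 and the result to 150.75.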
Thus, estimation is simplified if we can develop an estimate of N. We
do so by fitting a model that is linear in the response to detailing and
deriving N from this model. The linear model assumes:

$\lambda_1(x) = \lambda_3(x) = Bx$,

$\lambda_2(x) = Cx$,

and that BN = A.

Thus, equation (2) reduces to

(4)     $C_2(t+1) - C_2(t) = A \, d(t) - B \, (d(t) + \bar{d}(t)) \, C_2(t) + C \, (C_2(t) - C_2(t-1))$.


We estimate the parameters A, B, and C using ordinary least squares and
estimate N from the fact that N = A/B. Since the estimates of A and B,
$\hat{A}$ and $\hat{B}$ respectively, are approximately bivariate normal, the
distribution of N can be developed analytically; however, it is simpler to
obtain this distribution by simulation as follows: If X and Y are independent
identically distributed (0,1) normal random variables then it is easy to show
that

$A = \sigma_1 X + \mu_1$      and

$B = \frac{\sigma_{12}}{\sigma_1} X + Y \sqrt{\sigma_2^2 - \frac{\sigma_{12}^2}{\sigma_1^2}} + \mu_2$

are distributed as bivariate normal with mean $(\mu_1, \mu_2)$ and covariance
matrix

$\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}$.

The maximum likelihood estimate of N is the mode of the simulated frequency
distribution of A/B.
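
The least-squares step can be sketched in a few lines of plain Python. The sketch below regresses the change in prescribing doctors on the three regressors of equation (4), without an intercept, and returns N = A/B. It is checked on a synthetic, noise-free path, so the "true" coefficients are hypothetical; only the detailing series are taken from the first 12 quarters of Case 1 in Table 1. Solving the normal equations by hand, and rescaling the columns first, are choices of this sketch rather than anything stated in the paper.

```python
def solve_linear_system(m, v):
    """Solve m x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x


def fit_linear_model(c2, d, d_comp):
    """OLS without intercept for equation (4); returns (A, B, C, N = A / B).

    c2[t], d[t] and d_comp[t] are aligned series. The first usable response
    is c2[2] - c2[1], because the word-of-mouth term needs c2[t-1].
    """
    rows, y = [], []
    for t in range(1, len(c2) - 1):
        rows.append([d[t], -(d[t] + d_comp[t]) * c2[t], c2[t] - c2[t - 1]])
        y.append(c2[t + 1] - c2[t])
    # Rescale columns so the normal equations are better conditioned.
    scale = [max(abs(r[j]) for r in rows) for j in range(3)]
    xs = [[r[j] / scale[j] for j in range(3)] for r in rows]
    xtx = [[sum(r[i] * r[j] for r in xs) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(xs, y)) for i in range(3)]
    a, b, c = (beta / s for beta, s in zip(solve_linear_system(xtx, xty), scale))
    return a, b, c, a / b


# Synthetic check: generate a path from equation (4) with hypothetical
# coefficients, then confirm that OLS recovers them.
TRUE_A, TRUE_B, TRUE_C = 0.8, 7.5e-5, 0.6
detailing = [53, 47, 55, 57, 53, 46, 56, 61, 44, 53, 51, 49]
competitive = [308, 417, 383, 396, 411, 417, 462, 467, 498, 488, 523, 581]
path = [0.0, 0.0]                       # c2[0] and c2[1]
for t in range(1, len(detailing) - 1):
    path.append(path[t] + TRUE_A * detailing[t]
                - TRUE_B * (detailing[t] + competitive[t]) * path[t]
                + TRUE_C * (path[t] - path[t - 1]))
A_hat, B_hat, C_hat, N_hat = fit_linear_model(path, detailing, competitive)
```

Because the synthetic path contains no noise, the fitted values should match the hypothetical coefficients up to rounding error; with real data the ratio A/B inherits the sampling error of both estimates, which is what the simulation of A/B is meant to capture.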

Table 1 gives the key pieces of data for two cases of ethical drugs
introduced into two different markets. The data, obtained through a
cooperating firm from IMS America, have been disguised by multiplication by
an arbitrary constant to protect company confidentiality. Case 1 is used to
validate the model structure and Case 2 to illustrate model use. The
parameter estimates of the linear model are shown in Table 2, and the
distribution of N in Figure 3. Table 3 shows the parameter estimates for the
nonlinear model, assuming N = 10,700, the maximum likelihood estimate, using
only the first 12 points for fitting.
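
The simulation of N = A/B can be sketched as follows, using only the Python standard library. The point estimates and t-statistics are those of Table 2, with the exponent of B read as -5 (an assumption, chosen because it makes A/B come out near the 10,700 quoted in the text). The correlation between the two estimates is not reported in the paper, so the value 0.9 below is purely illustrative, as are the function names, the number of draws and the histogram range.

```python
import math
import random


def simulate_ab(mu1, mu2, var1, var2, cov12, n_draws, seed=0):
    """Draw (A, B) pairs from a bivariate normal via two independent N(0,1) draws."""
    rng = random.Random(seed)
    s1 = math.sqrt(var1)
    slope = cov12 / s1                               # loading of B on X
    resid = math.sqrt(var2 - cov12 ** 2 / var1)      # loading of B on Y
    draws = []
    for _ in range(n_draws):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((s1 * x + mu1, slope * x + resid * y + mu2))
    return draws


def histogram_mode(values, lo, hi, n_bins):
    """Midpoint of the fullest bin among the values falling in [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    best = max(range(n_bins), key=counts.__getitem__)
    return lo + (best + 0.5) * width


# Standard errors are backed out as estimate / t-statistic.
A_HAT, B_HAT = 0.800, 7.49e-5        # exponent of B is an assumption (see lead-in)
SE_A, SE_B = A_HAT / 2.07, B_HAT / 2.21
RHO = 0.9                            # NOT reported in the paper; illustrative only
pairs = simulate_ab(A_HAT, B_HAT, SE_A ** 2, SE_B ** 2,
                    RHO * SE_A * SE_B, n_draws=20000, seed=1)
ratios = [a / b for a, b in pairs if b != 0.0]
n_mode = histogram_mode(ratios, 0.0, 40000.0, 80)
```

Since B can come close to zero, the ratio has heavy tails; restricting the histogram to a plausible range before taking the mode is one simple way of handling this, and the location of the mode depends on the assumed correlation.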

The  function  f(t)  was  modeled  as  f(t)  =  1,  t<12,   =  .6,  t>12.   This 
form,  consistent  with  historical  decay  patterns  in  the  market,  works  for  case 
1.   Several  alternatives  were  tried  (exponential  decay,  varying  times  for  shift, 
varying  levels  for  shift)  and  this  one  worked  adequately  both  in  terms  of  fit 
and  prediction.   Operationally,  more  historical  analysis  will  lead  to  greater 
confidence  in  an  appropriate  form  for  f(t). 

Table 4 shows the forecasts obtained using the nonlinear model, together
with the actual sales, and Figure 4 graphs these series. The forecasts are
excellent, with a root mean square error of 43.86.

Table 5 shows the parameter estimates of a third order polynomial that
was fit to the data using the first 15 points, and Figure 5 the resultant
forecasts. Similarly, Table 6 and Figure 6 show the parameter estimates of a
third order autoregressive scheme and the resulting forecasts. In each of
these cases, the order of the model was selected as having the same number of
parameters as our model.
TABLE 1:    ANALYSIS DATA

                        Case 1                                Case 2
                       Competitive                           Competitive
Quarter    Detailing    Detailing     Sales      Detailing    Detailing     Sales

    1          53          308           0           84          725           0
    2          47          417          65           69          846          35
    3          55          383         131           91          834          70
    4          57          396         213           58         1023         111
    5          53          411         302           63          953         150
    6          46          417         303           67          837         173
    7          56          462         410           63          924         160
    8          61          467         668           84          953         168
    9          44          498         610           84          736         195
   10          53          488         775           81          992         223
   11          51          523         797           72          776         259
   12          49          581         672           81          662         307
   13          44          611         697           67          822         348
   14          43          581         829           47         1024         397
   15          40          585         803           45          989         405
   16          38          493         798           47          777         442
   17          41          505         764           72          992         448
   18          35          516         648           65          756         501
   19          32          485         746           79          852         506
   20          27          444         609           66         1103         489
   21          28          463         553           80          946         530
   22          26          427         505                                   446
   23          24          466         549                                   458
   24          22          472         522                                   471
Table 2:   Parameters of the Linear Model, First Set of Data

Variable      Value              T-Stat

A             0.800              2.07
B             7.49 x 10^-5       2.21
C             0.626              4.03

F(2;19)  =  522
Corrected R-Square  =  .98
Figure 3:   Simulated Distribution of N
Table 3:   Nonlinear Model Parameter Estimates*

Coef      Value                        T-Stat

a_1       2.28 x 10^[illegible]        0.11
a_2       7.99 x 10^[illegible]        0.21
a_3       7.19 x 10^[illegible]        1.17
a_4       0.56                         2.06

F(3;9)  =  168
Corrected R-Square  =  .98

*  Using MLE for N = 10,700
Table 4:   Forecasts of Sales Data, Case 1, Using the Nonlinear Model*

Quarter    Series Values    Forecast

    1             0
    2            64
    3           133
    4           213
    5           272
    6           311
    7           410
    8           578
    9           718
   10           775
   11           782
   12           783
   13           785
   14           786
   15           793            772
   16           796            749
   17           775            725
   18           730            702
   19           667            678
   20           604            652
   21           557            626
   22           537            601
   23           536            577
   24           536            552

Root Mean Square Error  =  43.86

*  Using MLE for N = 10,700
Figure 4:   Forecasts from the Nonlinear Model and Actual Sales, Case 1
Table 5:   Polynomial Model Coefficients

Model:   X(t)  =  A + BT + CT^2 + DT^3

Coef      Value       T-Stat

A          61.1        0.32
B         -23.9       -0.31
C          17.7        1.94
D          -0.86      -2.59

F(3;9)  =  112
Corrected R-Square  =  .965
Figure 5:   Forecasts from the Polynomial Model vs. Actual Sales
Table 6:   Third Order Autoregressive Model Parameter Estimates

Model:   X(t)  =  A + BX(t-1) + CX(t-2) + DX(t-3)

Coef      Value       T-Stat

A         64.90        2.37
B          1.99        7.47
C         -1.65       -3.55
D          0.60        2.37

F(3;8)  =  206.8
Corrected R-Square  =  .98
Figure 6:   Forecasts Using the Autoregressive Model
The forecasts from the polynomial model are obviously unsatisfactory,
since they become negative. The autoregressive model does better, but the
RMS error in this case is 295.89 (see Table 7), roughly 7 times the RMSE
for our model.
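
The comparison can be reproduced from the tabulated figures. The sketch below takes the actual and forecast values for quarters 15 to 24 from Tables 4 and 7 (reading the flattened Table 4 as actual value followed by forecast for each of those quarters, which is an interpretation of the layout) and evaluates the cubic trend with the Table 5 coefficients. The function and variable names are illustrative.

```python
import math

# Quarters 15-24, transcribed from Tables 4 and 7.
actual = [793, 796, 775, 730, 667, 604, 557, 537, 536, 536]
nonlinear_forecast = [772, 749, 725, 702, 678, 652, 626, 601, 577, 552]
autoregressive_forecast = [801, 832, 869, 899, 919, 928, 934, 941, 952, 964]


def rmse(observed, predicted):
    """Root mean square error between two equal-length series."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cubic_trend(t, a=61.1, b=-23.9, c=17.7, d=-0.86):
    """Third-order polynomial in time with the Table 5 coefficients."""
    return a + b * t + c * t ** 2 + d * t ** 3


rmse_nonlinear = rmse(actual, nonlinear_forecast)
rmse_autoregressive = rmse(actual, autoregressive_forecast)
rmse_ratio = rmse_autoregressive / rmse_nonlinear
first_negative_quarter = next(t for t in range(1, 25) if cubic_trend(t) < 0)
```

On these rounded table entries the two errors come out at about 43.8 and 294.8, close to but not identical with the reported 43.86 and 295.89, presumably because the published values were computed from unrounded forecasts. Their ratio is a little under 7, and the cubic trend first turns negative at quarter 20.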

Based on the results from these data, we have some confidence in the
model.
Table 7:   Forecasts Using Autoregressive Model, and Actual Sales

                              Autoregressive
Quarter    Actual Sales       Model Forecast

    1            0                 NA
    2           64                 NA
    3          133                 NA
    4          213                 NA
    5          272                 NA
    6          311                 NA
    7          410                 NA
    8          578                 NA
    9          718                 NA
   10          775                 NA
   11          782                 NA
   12          783                 NA
   13          785                 NA
   14          786                 NA
   15          793                801
   16          796                832
   17          775                869
   18          730                899
   19          667                919
   20          604                928
   21          557                934
   22          537                941
   23          536                952
   24          536                964

RMSE  =  295.89
5.  Using the Model

Now we turn to the question: how does one use the model in the typical
new product situation when few data are available? The proposed procedure is
similar to that developed in the previous section, i.e., using a linear model
to obtain an estimate of N, and then estimating the parameters of the
nonlinear model. The only differences are that

(i)   a small number of data points (4 to 8) are employed in the
estimation,
(ii)  priors for the parameters A, B, C and $a_1, \ldots, a_4$, derived from
other "similar" products and modified, if necessary, to reflect
unique characteristics of the product class, are used together
with these data points in a Bayesian procedure,
(iii) the parameters are updated as more data become available.

The use of this procedure assumes that the structure of sales growth will be
similar from drug class to drug class, although the target population might
be different.

Thus our procedure is as follows:

(a)  Estimate parameters of the linear model, using ordinary least
squares or Bayesian regression (if past data are available).

(b)  Derive the distribution of N from the assumption that (A,B) are
bivariate normal.

(c)  Pick several values of N, $N_1, \ldots, N_k$, from the distribution
derived in (b), and, incorporating prior estimates of $a_1, \ldots, a_4$
from previous data, develop posterior estimates of $a_1, \ldots, a_4$.

(d)  Develop a detailing policy to maximize expected long-term per
period profit from the distribution of N.

We illustrate this procedure in this and the next sections, using Case 2 data
from Table 1.
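
The formulas behind the Bayesian regression step are not spelled out in this section. As a hedged illustration of the mechanism, the sketch below applies the standard conjugate normal update to a single regression coefficient, treating the noise variance as known. This is a simplifying assumption of the sketch, not a statement of the authors' exact procedure; the multi-parameter case replaces the scalars with vectors and matrices. All names and numbers are hypothetical.

```python
def update_coefficient(prior_mean, prior_var, x, y, noise_var):
    """Posterior mean and variance of beta in y = beta * x + e, e ~ N(0, noise_var),
    given a N(prior_mean, prior_var) prior on beta (noise_var treated as known)."""
    prior_precision = 1.0 / prior_var
    data_precision = sum(xi * xi for xi in x) / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + sum(xi * yi for xi, yi in zip(x, y)) / noise_var)
    return post_mean, post_var


# Hypothetical numbers: the prior says beta is about 1, the data say about 3.
mean_2, var_2 = update_coefficient(1.0, 1.0, [1.0, 1.0], [3.0, 3.0], 1.0)
mean_8, var_8 = update_coefficient(1.0, 1.0, [1.0] * 8, [3.0] * 8, 1.0)
```

With two observations the posterior mean is 7/3, still pulled toward the prior; with eight it is 25/9 and the posterior variance falls from 1/3 to 1/9. This is the sense in which the prior stabilizes early estimates and is progressively outweighed as sales data accumulate.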

In developing a prior for Case 2, parameters from Case 1 on the entire
data stream were used, modified to reflect the slower diffusion rate expected
for drugs in this (second) class. In particular, $-a_1/2a_2$ was set
initially equal to 90, consistent with historical detailing levels in this
class. The variance-covariance matrix was used directly from Case 1. Note
that this assumes that the two drugs have identical market characteristics,
their covariances differing only due to sampling variation. Greater
experience with historical cases will lead to more realistic priors; as we
will see, this level is quite close to optimal even after adding 16 data
points. The updated (posterior) coefficients for the linear model, using
priors from Case 1 and the first eight data points, are given in Table 8.


Table 8:   Posterior Coefficients, Linear Model (Case 2)

          Updated Coefficients
          (mean)                       t-values

A         0.171                         4.18
B         3.17 x 10^[illegible]         5.42
C         0.805                        27.12
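
One way to read the ratio $-a_1/2a_2$ used in setting the prior: under the quadratic response form proposed for $\lambda_1$, it is the detailing level at which the response to own detailing peaks. The short derivation below is a worked restatement under that functional form, not an additional result from the paper.

```latex
\lambda_1(d) = a_1 d + a_2 d^2, \qquad
\frac{\partial \lambda_1}{\partial d} = a_1 + 2 a_2 d = 0
\;\Longrightarrow\; d^{*} = -\frac{a_1}{2 a_2}, \qquad
\frac{\partial^2 \lambda_1}{\partial d^2} = 2 a_2 < 0 \quad \text{when } a_2 < 0 .
```

The stationary point is a maximum only when $a_2$ is negative, which is the sign of the posterior estimates of $a_2$ reported for Case 2.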


The density of N, obtained by simulation, is shown in Figure 7. Figure 8
shows the Bayesian nonlinear model forecasts based on the first eight data
points, showing the forecasts obtained using the maximum likelihood estimate
of N, together with a 95% prediction interval. Figure 9 shows the improved
fit and prediction as 6 more points are added. Table 9 summarizes the
estimates of the $a_i$ for the models used to produce these two sets of
forecasts. We should note that the Bayesian procedure actually allows us to
use the five-parameter nonlinear model, even with a small number of data
points available. Without the Bayesian approach, we would be forced to run
the linear model in the early part of the life of the product, which is
highly undesirable from both a policy and a forecasting viewpoint. Figure 10
shows the forecasts obtained with 8 data points using the linear model. A
comparison with Figure 8 indicates the tremendous improvement made possible
through the use of a prior. Figure 11, the linear model with priors, does
better, but still is far worse than the nonlinear Bayesian model. Table 10
summarizes the root mean square errors obtained for each of the three
forecasting methods.

Figure 7:   A Simulated Density of N, Case 2

Figure 8:   Case 2: Bayesian Estimate, Using First 8 Points Plus Case 1
Data as Prior, Prediction (and Prediction Interval) on Rest of Data

Figure 9:   Updated Forecast, Adding 8 More Points, Second Data Set,
Nonlinear Model

Table 9:   Posterior Estimates of the Coefficients

a)  After 8 points:

Variable      Value                         t-stat

a_1            3.41 x 10^[illegible]        15.5
a_2           -1.82 x 10^[illegible]        -2.80
a_3            1.11 x 10^[illegible]         1.71
a_4            0.81                          4.50

b)  After 6 additional points:

Variable      Value                         t-stat

a_1            3.40 x 10^[illegible]        17.4
a_2           -1.83 x 10^[illegible]        -3.05
a_3            1.10 x 10^[illegible]         2.00
a_4            0.82                          4.82

Table 10: Summary of Forecasting Accuracy

Method                                            RMSE
OLS (Figure 10)                                   237.5
Bayesian Estimate, Linear Model (Figure 11)       120.6
Bayesian Estimate, Nonlinear Model (Figure 8)     31.1

Thus,  the  nonlinear  model  with  bayesian  estimates  predicts  best.  As 
a  forecasting  procedure,  it  seems  useful;  next  we  move  to  issues  of  policy 
development. 
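
As a reminder of how the accuracy figures in Table 10 are defined, the
following is a minimal sketch of a root mean square error calculation over a
holdout period. The function name `rmse` and the sample numbers are
hypothetical and chosen only for illustration; they are not the data behind
Table 10.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between observed and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical holdout observations and two competing forecasts.
holdout = [410.0, 455.0, 480.0, 502.0]
forecast_a = [400.0, 450.0, 490.0, 500.0]
forecast_b = [300.0, 380.0, 600.0, 650.0]

print("RMSE, forecast A:", rmse(holdout, forecast_a))
print("RMSE, forecast B:", rmse(holdout, forecast_b))
```

The method with the smaller value tracks the held-out data more closely, which
is the sense in which the three methods are ranked.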
Figure 10: Case 2: OLS Estimation Using First 8 Points,
Prediction of Rest of Data (Linear Model)

Figure 11: Case 2: Bayesian Estimate, Using First 8 Points,
Case 1 Data as Prior, Prediction of Rest of Data (Linear Model)
6.   Determination and Updating of Detailing Policies

In principle, the profit-maximizing policy over a planning horizon T
periods long can be obtained by solving a dynamic programming problem with one
state variable, C (see equations 1a-1c). Computation of such a policy requires
some assumptions about competitive detailing activity during the planning
period, but these assumptions can probably be made, and the sensitivity of
the policy to them examined.

We believe, however, that this approach will lead to policies that are
complicated to implement and also unrealistic, for the following reasons:

a. For competitive reasons it is usually desirable to drive the market
share of the new product up as quickly as possible, and then to maintain it
at that level. As will be shown below, this would imply a pulse of detailing
activity during the introductory phase of the detailing campaign, followed by
a (perhaps) reduced "maintenance" level of detailing during the life of the
product.

b. Product management is dealing with a portfolio of drugs, all of which
are promoted by the same detailing force. Highly time-dependent policies,
calling for a different amount of effort on each drug in each period, are
difficult to implement or control. These are the types of policies that are
likely to be yielded by a dynamic programming, profit maximization
formulation. Assuming a sequence of new product introductions by the company,
an approximate "steady state" policy for the detailing force would be to
devote a certain fraction of its effort to new products and the balance to
"maintenance detailing."
In view of the above we shall develop the parameters of a policy of the
following type: "Drive the market share of the product up to some level m,
and then maintain it at this level."

The introductory phase goal then is to reach a desired share m as quickly
as possible. We can operationalize this by computing policies that maximize
m at the end of t periods, where t can be a variable to be selected to
provide the desired m.

Setting t = 1, it is easy to show from equation 4 that the optimal detailing
level is $d^* = -a_1/2a_2$. Because the objective function as now set up is
separable between periods, we can show that $d_j^* = -a_1/2a_2$,
$j = 1, 2, \ldots, t$, maximizes $m_t$, the market share at the end of t
periods. Thus, during the introductory phase, the detailing level should be
maintained at $-a_1/2a_2$ until the desired or target share is achieved. In
order to compute the values of $m_t$, assumptions must be made regarding
competitive detailing levels.
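
The one-period result can be checked with a short calculation. Equation 4 is
not reproduced in this section, so the following assumes, for illustration
only, that own detailing d enters the one-period share through the quadratic
response term $a_1 d + a_2 d^2$ and that the remaining terms do not depend on
d. Under that assumption the first- and second-order conditions are:

$$\frac{\partial}{\partial d}\left(a_1 d + a_2 d^2\right) = a_1 + 2 a_2 d = 0
\quad\Longrightarrow\quad d^* = -\frac{a_1}{2 a_2},
\qquad
\frac{\partial^2}{\partial d^2}\left(a_1 d + a_2 d^2\right) = 2 a_2 .$$

The stationary point is a maximum only when $a_2$ is negative, which agrees in
sign with the negative second coefficient reported in Table 9.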

In the long run, a reasonable objective is to maximize steady-state
per-period profit. For a fixed N, the per-period number of prescribing doctors
is:

$$C_s = \frac{N\,(a_1 d + a_2 d^2)}{a_1 d + a_2 d^2 + a_3 \bar{d}}$$

(assuming $\bar{d}$ = constant).

Per-period profit is, then:

$$\Pi_s(N) = K_0 C_s - K_1 d$$


We may wish to choose a policy d that maximizes expected profit, as follows:

$$\max_d \; E[\Pi_s] = \max_d \int \Pi_s(n)\, f_N(n)\, dn$$

where $f_N(n)$ refers to the distribution of the number of potential
prescribers, calculated from the procedure described in Section 5.
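
The expected-profit calculation can be sketched numerically as a probability-
weighted sum over a discrete distribution of N followed by a grid search over
d. The sketch below is illustrative only. The distribution of N is the one
listed in Table 11, and K0 = 66 and K1 = 95 are the values quoted for this
case. Everything else is an assumption: the response coefficients `A1`, `A2`
and the loss term `LOSS` are hypothetical round numbers (the exponents of the
estimated coefficients are not legible in the source tables), and the
steady-state form used is a simple trial/loss balance assumed for
illustration. The output is therefore not expected to reproduce Table 11.

```python
# Discrete distribution of N transcribed from Table 11.
N_VALUES = [2000, 3500, 5000, 6500, 8000, 9500,
            11000, 12500, 14000, 15500, 17000, 18500]
P_VALUES = [0.001, 0.002, 0.007, 0.030, 0.120, 0.225,
            0.265, 0.191, 0.099, 0.040, 0.013, 0.006]

K0, K1 = 66.0, 95.0          # profit coefficients quoted in the text

# HYPOTHETICAL response parameters, chosen only to make the sketch runnable.
A1, A2 = 4.0e-3, -2.0e-5     # assumed quadratic detailing response
LOSS = 0.1                   # assumed constant per-period loss term


def steady_state_prescribers(n, d):
    """Assumed steady-state number of prescribers for pool size n, detailing d."""
    gain = A1 * d + A2 * d * d
    if gain <= 0.0:
        return 0.0
    return n * gain / (gain + LOSS)


def profit(n, d):
    """Per-period steady-state profit for a fixed n."""
    return K0 * steady_state_prescribers(n, d) - K1 * d


def expected_profit(d):
    """Probability-weighted profit; weights are renormalized to sum to one."""
    total_p = sum(P_VALUES)
    weighted = sum(p * profit(n, d) for n, p in zip(N_VALUES, P_VALUES))
    return weighted / total_p


# Grid search over detailing levels 0.0, 0.1, ..., 200.0.
grid = [i / 10.0 for i in range(0, 2001)]
best_d = max(grid, key=expected_profit)

print("detailing level maximizing expected profit:", best_d)
print("expected per-period profit at that level:", expected_profit(best_d))
```

A grid search is used here instead of a closed-form solution so that the same
code still works if the assumed response function is replaced by another one.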
For our case, the values of $K_0$ and $K_1$ are 66 and 95, respectively.
Table 11 gives optimal policies as a function of N and the expected profit
associated with the different policies. The optimal policy is roughly 80-83
per period.

As indicated earlier, a short-run policy is to drive the share up as fast
as possible. In our case, this is done by setting $d = -a_1/2a_2$, where we
use the posterior estimates of $a_1$ and $a_2$. For the value of N associated
with the optimal long-term policy, this level of effort is 93.

If  we  assume  that  d  is  approximately  constant  and  d 

then we get that the steady-state share is:

$$\frac{C_s}{N} = \frac{a_1 d + a_2 d^2}{a_1 d + a_2 d^2 + a_3 \bar{d}} = .47$$

By  our  reasoning,  then,  the  suggested  policy  is  to  set  a  detailing  level  at  93 
until  a  share  of  about  .47  is  reached  and  then  back  down  to  around  80. 

One of the strengths of the Bayesian approach is that updating of policies
is natural as more data are collected. In a manner identical to that above,
the updated optimal long-term policy, after 6 more points became available,
was calculated as 79.8, quite close to the one calculated previously. In this
case, even after 6 additional periods, the optimal policy remains stable. (In
practice, updating would occur each time new data were received from the
field.)
Table 11: Optimal Policy Development

                     Optimal Per-Period        Expected Per-Period
N_i       P(N_i)     Detailing Level,          Profit ($1000's)
                     Given N_i
2000      .001       47.6                      1.58
3500      .002       65.3                      1.76
5000      .007       75.6                      1.86
6500      .030       80.0                      1.87
8000      .120       82.6                      1.87
9500      .225       84.1                      1.86
11000     .265       84.6                      1.84
12500     .191       86.7                      1.84
14000     .099       87.2                      1.84
15500     .040       87.8                      1.84
17000     .013       88.3                      1.83
18500     .006       88.7                      1.62
The impact of the form of the f(t) function could be of concern here.
Note, however, that

(a) the short-term policy uses f(t) = 1, so that policy is not
affected by f(t) at all; and

(b) the steady-state policy is affected only by the level of the
shift. If the level (from f(t) = 1 to .6 in our case) is biased,
our updating procedure will compensate for the bias in the updated
estimates of $a_1$ and $a_2$. This occurs in the results reported in
Figure 10.

Thus, the policy development aspect of the procedure, our main focus
here, is relatively insensitive to the choice of f(t).
7.   Detailing Force Implications

In the last section we showed how detailing policies can be computed for
a single product line. If we can assume that the different product lines
constituting the portfolio of product offerings are independent of one
another, then portfolio profit maximization can be achieved by selecting the
optimal market share for each product line individually, so long as the total
number of detailers required does not exceed the available force.

In general, however, the portfolio maximization problem, given a fixed
detailing force D, can be addressed as a Lagrangian problem. If $P_{it}(m)$ is
the profit associated with the i-th product line in period t with a
steady-state share m, and $d_{it}(m)$ is the detailing force required in the
same period (note that given m and our policy as in the previous section,
$d_{it}$ can take only one of two possible values), we wish to

$$\text{Maximize} \; \sum_{i,t} P_{it}(m)
\quad \text{subject to} \quad \sum_i d_{it} \le D .$$

Detailing manpower and detailing cost will be assumed to be linearly
related, a reasonable assumption given that some detailing will always be
done. Market share is a concave function of detailing activity both for
the introductory phase and the maintenance phase, as is illustrated in
Figures 3 and 4. Therefore $P_{it}(m)$ is convex in $d_{it}$. This implies
that solutions to the Lagrangian problem

(10)  $$L(D) = \sum_{i,t} P_{it}(m) - \sum_t \lambda_t \Big( \sum_i d_{it} - D \Big)$$

will be unique. In addition, $\lambda_t$ will provide us the marginal value of
additional detailers.
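
The role of the multiplier can be illustrated with a small single-period
sketch. It is a simplification under stated assumptions, not the procedure of
this paper: detailing is treated as continuous (the text restricts each
product to one of two levels), only one period is considered, and each
product's profit is taken to be a hypothetical quadratic `r*d - c*d**2` with
`c > 0`. The function name `allocate` and all numbers are illustrative.

```python
def allocate(products, total_force, tol=1e-9):
    """Split a fixed detailing force across products by bisection on lambda.

    products: list of (r, c) pairs for the assumed profit r*d - c*d**2, c > 0.
    Returns (lam, allocations), where lam is the multiplier on the force limit.
    """
    def demand(lam):
        # Each product's best detailing level when a unit of force costs lam.
        return [max(0.0, (r - lam) / (2.0 * c)) for r, c in products]

    # If the unconstrained optimum already fits, the constraint is slack.
    if sum(demand(0.0)) <= total_force:
        return 0.0, demand(0.0)

    low, high = 0.0, max(r for r, _ in products)
    while high - low > tol:
        mid = (low + high) / 2.0
        if sum(demand(mid)) > total_force:
            low = mid      # force is over-used: raise its shadow price
        else:
            high = mid     # force fits: try a lower shadow price
    return high, demand(high)


# Three hypothetical product lines sharing a force of 150 detailers.
products = [(10.0, 0.05), (8.0, 0.04), (6.0, 0.05)]
lam, allocations = allocate(products, 150.0)

print("multiplier (marginal value of one more detailer):", lam)
print("allocation by product line:", allocations)
```

In this sketch the multiplier returned is the per-detailer marginal profit at
the constrained optimum, which is the interpretation given to the multiplier
in the text.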
8.  Discussion and Conclusion

This paper has developed an approach toward modeling and controlling a
market penetration program when a word-of-mouth effect is present. An aspect
of the procedure, applicable in many other product areas, is that it uses a
Bayesian procedure, with priors developed on other, similar products, to
permit parameter estimation earlier in the life of the product. This updating
procedure is in marked contrast to other judgmental methods in that it:

(1) specifically and systematically accounts for information available
in similar product areas, and

(2) allows for updating of parameter estimates for purposes of
forecasting and control, gradually improving the estimates as data
come in.

The model developed here forecasts quite well in the test demonstrated,
and the Bayesian model works much better than a more standard procedure. Most
importantly, the model allows for calculation and dynamic updating of optimal
marketing policies at a point in a product's life when sufficient historical
data are not available to make clear "classical" inferences.

We also show that it is feasible both to estimate the effect of marketing
variables in a trial/repeat framework and to dynamically update the derived
policy. A modified version of the procedure appears applicable to a variety
of similar new product marketing problems.
APPENDIX:   A NOTE ON THE BAYESIAN REGRESSION PROCESS WITH A NATURAL CONJUGATE


Suppose our regression model is:

$$y_i = x_i \beta + \epsilon_i$$

Then, the density of $y_i$ is:

$$f_N(y_i \mid x_i \beta, h) = (2\pi)^{-1/2}\, e^{-\frac{1}{2} h (y_i - x_i \beta)^2}\, h^{1/2}$$

where $h = 1/\sigma^2$.

The likelihood of a sample $y_1, \ldots, y_n$ is:

$$\prod_{i=1}^{n} (2\pi)^{-1/2}\, e^{-\frac{1}{2} h (y_i - x_i \beta)^2}\, h^{1/2}$$

with the kernel (in matrix notation) of

$$e^{-\frac{1}{2} h (y - X\beta)^T (y - X\beta)}\, h^{n/2}$$

Let b be the solution of the normal equations:

$$X^T X b = X^T y$$

If h is known, we proceed as follows:
Let the prior of $\beta$ be normal, $N(b', (h n')^{-1})$
(where $n'$ is positive definite and symmetric. Note that n below is $X^T X$.)

Multiplying the kernel of the prior with the kernel of the likelihood gives
$e^{-\frac{1}{2} h T}$, where

$$T = (y - X\beta)^T (y - X\beta) + (\beta - b')^T n' (\beta - b') = T_1 + T_2$$

After some algebra:

$$T_1 = (\beta - b'')^T n'' (\beta - b'')$$

and

$$T_2 = b'^T n' b' + y^T y - b''^T n'' b''$$

where

$$n'' = n + n' = X^T X + n'$$

and

$$b'' = (n'')^{-1} (n b + n' b')$$

Now $\beta$ only appears in $T_1$ (clearly in normal form), so the kernel of
the posterior is normal, $N(b'', (h n'')^{-1})$.
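
The known-precision update can be sketched in a few lines of code. This is a
minimal illustration, not the estimation code used for the results above: the
function names `solve` and `posterior_normal` and the toy data are
hypothetical, and plain Python lists are used so that the sketch needs no
external library. It relies on the normal equations to write
$n b = X^T y$, so that $b''$ solves $n'' b'' = X^T y + n' b'$.

```python
def solve(a, rhs):
    """Solve a x = rhs by Gauss-Jordan elimination with partial pivoting."""
    size = len(a)
    m = [list(map(float, a[i])) + [float(rhs[i])] for i in range(size)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(size):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][size] / m[i][i] for i in range(size)]


def posterior_normal(x_rows, y, b_prior, n_prior):
    """Return (b_post, n_post) for a normal prior with known precision h.

    n_post = X'X + n_prior, and b_post solves n_post b_post = X'y + n_prior b_prior.
    """
    k = len(b_prior)
    xtx = [[sum(row[i] * row[j] for row in x_rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(x_rows, y)) for i in range(k)]
    n_post = [[xtx[i][j] + n_prior[i][j] for j in range(k)] for i in range(k)]
    target = [xty[i] + sum(n_prior[i][j] * b_prior[j] for j in range(k))
              for i in range(k)]
    return solve(n_post, target), n_post


# Toy data lying exactly on y = 1 + 2x (intercept column plus one regressor).
x_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]

# Hypothetical prior centered at zero with unit prior precision weights.
b_post, n_post = posterior_normal(x_rows, y, [0.0, 0.0],
                                  [[1.0, 0.0], [0.0, 1.0]])
print("posterior mean:", b_post)
print("posterior precision matrix (up to the factor h):", n_post)
```

With only four observations the posterior mean is pulled from the least
squares solution toward the prior mean; as the prior weights in `n_prior`
shrink toward zero, the result approaches the ordinary least squares estimate.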

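When the precision (h) is not known, the analysis is similar, but with
normal-gamma (Studentized) prior and posterior densities.

Here we need to let:

$$p = \text{rank}\; n, \qquad \nu = n - p$$

$$p' = \text{rank}\; n' \quad \text{and} \quad v = \frac{1}{\nu} (y - Xb)^T (y - Xb)$$

$$p'' = \text{rank}\; n''$$

(in $\nu = n - p$, n denotes the number of observations).

The prior joint density is:

$$f(\beta, h \mid b', v', n', \nu') \propto
e^{-\frac{1}{2} h (\beta - b')^T n' (\beta - b')}\, h^{p'/2}\;
e^{-\frac{1}{2} h \nu' v'}\, h^{\frac{1}{2}\nu' - 1}$$

And the posterior is:

$$f(\beta, h \mid b'', v'', n'', \nu'')$$

where

$$n'' = n' + n, \qquad b'' = (n'')^{-1} (n' b' + n b)$$

$$\nu'' = \nu' + \nu + p$$

and

$$v'' = \frac{1}{\nu''} \left[ (\nu' v' + b'^T n' b') + (\nu v + b^T n b) - b''^T n'' b'' \right]$$

Note here that the marginal densities of $\beta$ and h, respectively, are:

$$f(\beta \mid b, n/v, \nu) \propto \left[ \nu + (\beta - b)^T (n/v) (\beta - b) \right]^{-\frac{1}{2}(\nu + p)}$$
(Student)

and

$$f(h \mid v, \nu) \propto e^{-\frac{1}{2} h \nu v}\, h^{\frac{1}{2}\nu - 1}$$
(gamma)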